Sample names and data file paths visualized in this report:
sample1: /Users/kathrynwalters/Documents/RStudio Git Locations/2024_sBC_proteogenomics/input/Betacell_mergedBamsSorted.sortedByCoord.out.bam_results_RiboseQC
Per sample, the distribution of reads across different originating compartments (e.g. cytoplasmic and organellar footprints) and biotypes (e.g. CDS regions of protein coding genes) is shown.
Per sample, the distribution of read lengths is shown per originating compartment.
Per sample and originating compartment, read length and location distributions are shown.
For each sample, absolute number of reads and normalized read length distributions are shown.
Read count shows absolute read numbers; in Read count fraction the number of reads for each biotype sums up to 1.
Per read length, the read distribution for different biotypes is shown (stacked barplot). Read count shows absolute numbers; in Read count fraction, the number of reads for each read length sums up to 1.
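The Read count fraction view described above can be sketched as a per-read-length normalization. This is a minimal illustration, not the Ribo-seQC implementation: the function name `to_fractions` and the counts are hypothetical.

```python
def to_fractions(counts):
    """Divide each count by the total so the values sum to 1 (all zeros if total is 0)."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}


# Hypothetical counts: read length -> biotype -> number of reads.
counts_by_length = {
    28: {"CDS": 60, "5UTR": 30, "3UTR": 10},
    29: {"CDS": 150, "5UTR": 40, "3UTR": 10},
}

# Normalize within each read length, as in the stacked barplot.
fractions_by_length = {
    length: to_fractions(biotypes)
    for length, biotypes in counts_by_length.items()
}
```

Normalizing within each read length makes the biotype composition comparable across lengths even when the absolute read numbers differ widely.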
Profiles of 5’ ends are displayed as a metagene plot aggregating signal over all covered transcripts. 5’ end profiles are calculated at sub-codon resolution and over binned transcript regions.
Different scaling methods can be applied to the calculated profiles. Profiles for individual read lengths (without scaling) can also be visualized.
Disclaimer: When comparing samples, the read lengths displayed may differ, since read lengths are chosen for each sample individually.
Select a resolution (subcodon or bins): In case of subcodon resolution, read coverage is shown for the first 25nt after the transcription start site (TSS), 25nt before and 33nt after the start codon, 33nt from the middle of the CDS, 33nt before and 25nt after the stop codon, and the last 25nt before the transcription end site (TES). In case of bins, read coverage is shown for 50 bins between the TSS and the start codon (5’UTR), 100 bins for the CDS, and 50 bins after the stop codon (3’UTR).
Select a scaling method: none (no scaling), log2 scaling, or z-score scaling.
Note: Select subcodon / none or bins / none scaling to display the read coverage for each read length separately.
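The three scaling options can be illustrated with a small sketch. The exact formulas used by Ribo-seQC are not stated in this report, so the following are assumptions: log2 scaling is taken as log2(x + 1) to handle zero coverage, and z-score scaling uses the mean and population standard deviation of the profile. The function name `scale_profile` is hypothetical.

```python
import math
import statistics


def scale_profile(profile, method="none"):
    """Scale a coverage profile (a sequence of read counts per position or bin)."""
    values = [float(v) for v in profile]
    if method == "none":
        return values
    if method == "log2":
        # Assumption: a pseudocount of 1 avoids log2(0) at uncovered positions.
        return [math.log2(v + 1.0) for v in values]
    if method == "zscore":
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        if sd == 0:
            # A flat profile has no variation to standardize.
            return [0.0 for _ in values]
        return [(v - mean) / sd for v in values]
    raise ValueError(f"unknown scaling method: {method}")


raw = [0, 1, 3, 7]
log_scaled = scale_profile(raw, "log2")
z_scaled = scale_profile(raw, "zscore")
```

Log scaling compresses large peaks such as those at the start codon, while z-scores put profiles of different depth on a common scale.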
Read lengths, as well as their individual offsets, are selected according to the parameters specified in the Ribo-seQC run.
Note: Not all samples and originating organelles might be displayed here. Please check the parameters used in the Ribo-seQC run.
The fraction of 5’ends (from Section 4.1) falling on each of the three possible frames is displayed for each read length and organelle. Each data point represents one transcript.
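As a rough sketch of the per-transcript frame statistic, the fraction of 5’ends in each frame can be computed from positions taken relative to the CDS start. The 0-based coordinate convention and the function name `frame_fractions` are assumptions made for illustration.

```python
def frame_fractions(positions):
    """Return the fraction of positions in frame 0, 1 and 2.

    Positions are assumed to be 0-based offsets from the first nucleotide
    of the start codon, so the frame is the position modulo 3.
    """
    counts = [0, 0, 0]
    for pos in positions:
        counts[pos % 3] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [count / total for count in counts]


# Hypothetical 5'end positions for one transcript and one read length.
example = frame_fractions([0, 3, 6, 7])
```

A strong preference for one frame at a given read length indicates clear 3-nt periodicity, which is what makes that read length useful for P-site inference.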
Cutoffs and frame statistics are shown for selected read lengths:
Based on the parameters specified in the Ribo-seQC run, the following read lengths (with their offsets) were selected to infer P-site positions.
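Conceptually, a P-site position is obtained by shifting the 5’ end of a read by the offset chosen for its length. The sketch below assumes the common convention of adding the offset on the plus strand and subtracting it on the minus strand. The offsets in `OFFSETS` are hypothetical placeholders, not the values selected in this run.

```python
# Hypothetical offsets (read length -> nt from the 5' end to the P-site).
OFFSETS = {28: 12, 29: 12, 30: 13}


def psite_position(five_prime, read_length, strand, offsets=OFFSETS):
    """Return the inferred P-site coordinate, or None for unselected read lengths."""
    offset = offsets.get(read_length)
    if offset is None:
        # Read lengths without a selected offset do not contribute P-sites.
        return None
    if strand == "+":
        return five_prime + offset
    return five_prime - offset


plus_site = psite_position(100, 29, "+")
minus_site = psite_position(100, 29, "-")
```

Reads whose length was not selected are skipped, which is why some samples or organelles may show no P-site profile at all.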
Read coverage in the form of P-site profiles is displayed here, with the same visualization options available in Section 4.2.
Note: Not all samples and originating organelles might be displayed here. Please check the parameters used in the Ribo-seQC run.
In order to reveal possible contaminating sequences, the top 50 mapping positions (using 5’ends) are listed, together with genomic feature annotation and nucleotide sequences.
The 50 genes with the highest read counts are listed below for (i) CDS regions of protein coding genes and for (ii) all genes.
Based on the P-site positions (Section 4.3), codon usage within CDS regions of protein coding genes is shown here. In addition, position-specific values are calculated for the first 11 codons of the CDS, 11 codons from the middle of the CDS, and the last 11 codons of the CDS; these regions are referred to as start, middle, and stop, respectively.
Codon usage can be accessed with positional information, or summed up over all positions (Section 8 Bulk codon usage).
Codon counts shows codon occurrences at each position; P-sites counts shows the number of P-site positions mapping to each codon and position; P-sites per codon shows the ratio of P-sites counts to Codon counts. The same calculations are performed using A-site (shifting P-sites +3nt) and E-site (shifting P-sites -3nt) positions. These values are calculated for all read lengths combined, and also for individual read lengths (available in the full report). Different scaling methods are also available.
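The ratio described above can be sketched for a single CDS. For brevity this illustration sums over all codon positions rather than keeping positional information, and it assumes P-sites are given as 0-based offsets from the CDS start. The function names and the toy sequence are hypothetical.

```python
from collections import Counter


def codon_at(cds, position):
    """Return the in-frame codon containing a 0-based CDS position, or None if outside the CDS."""
    if position < 0:
        return None
    start = position - position % 3
    codon = cds[start:start + 3]
    return codon if len(codon) == 3 else None


def sites_per_codon(cds, psites, shift=0):
    """Ratio of site counts to codon counts; shift=+3 gives A-sites, shift=-3 gives E-sites."""
    codon_counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    site_counts = Counter()
    for psite in psites:
        codon = codon_at(cds, psite + shift)
        if codon is not None:
            site_counts[codon] += 1
    return {codon: site_counts[codon] / n for codon, n in codon_counts.items()}


cds = "ATGGCTGCTTAA"      # toy CDS: ATG GCT GCT TAA
psites = [0, 3, 3, 6]     # hypothetical P-site positions
p_ratio = sites_per_codon(cds, psites)
a_ratio = sites_per_codon(cds, psites, shift=3)
e_ratio = sites_per_codon(cds, psites, shift=-3)
```

Dividing by the codon count corrects for how often each codon occurs, so a high ratio reflects ribosome density on that codon rather than its abundance in the sequence.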
Note: The genetic code, which assigns amino acids to codons, can differ between organelles, species and originating genomes. Different scales are used for ATG/stop codons and other codons.
Note: Codon usage calculation depends on successful P-site calculation.